Objectives for Intelligent Payload Module Testbed


• Low power/high performance benchmarking 

- Test various typical onboard science processing requirements 
with parallel processing via multicore processors and field 
programmable gate array circuits 

- Target processors that can be radiation tolerant or radiation 
hardened 

• Airborne Intelligent Payload Module (IPM) box used as
proxy for satellite version of IPM

• Research being conducted under AIST-11 effort, "A High 
Performance Onboard Multicore Intelligent Payload 
Module for Orbital and Suborbital Decadal Missions" 


Potential Users of IPM 


HyspIRI Smallsat mission 

- Visible Shortwave InfraRed (VSWIR) Imaging Spectrometer 

- Multispectral Thermal InfraRed (TIR) Scanner

HyspIRI Space Station mission 

- Visible Shortwave InfraRed (VSWIR) Imaging Spectrometer 

- Multispectral Thermal InfraRed (TIR) Scanner 

GEO-CAPE


Sample Operational Scenario: Detection of Harmful Algal Blooms 
with Rapid Map Downlinked to Validation Team on Ground 



Realtime map with the following
processing steps:

- Radiance to reflectance conversion
- Atmospheric Correction
- Geocorrection/Co-registration
- Classification (Web Coverage
Processing Service)
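As an illustration of the first step, radiance is commonly converted to top-of-atmosphere reflectance with the standard relation rho = pi * L * d^2 / (E_sun * cos(theta_s)). The sketch below shows only that relation. The function name and the sample numbers are assumptions for illustration, not values or code from the IPM software, which relies on FLAASH for the full atmospheric correction.

```python
import math

def toa_reflectance(radiance, esun, sun_zenith_deg, earth_sun_dist_au=1.0):
    """Top-of-atmosphere reflectance from at-sensor spectral radiance.

    radiance          -- at-sensor radiance, W / (m^2 sr um)
    esun              -- mean solar exoatmospheric irradiance, W / (m^2 um)
    sun_zenith_deg    -- solar zenith angle in degrees
    earth_sun_dist_au -- Earth-Sun distance in astronomical units
    """
    cos_zenith = math.cos(math.radians(sun_zenith_deg))
    if cos_zenith <= 0:
        raise ValueError("sun is at or below the horizon")
    return math.pi * radiance * earth_sun_dist_au ** 2 / (esun * cos_zenith)

# Hypothetical single-band example values (not measured data).
rho = toa_reflectance(radiance=80.0, esun=1850.0, sun_zenith_deg=30.0)
print(round(rho, 4))
```

In a real chain the same expression would be applied per band and per pixel, with the band-specific solar irradiance.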



Harmful Algal Bloom 





Processors Used in Conjunction With The Testbed


Tilera Tile64 as Proxy for Maestro


Specifications 

• Launched on August 20, 2007 

• 8 X 8 tile array (64 cores) 

• Each tile on chip is an independent processor 
capable of running an entire operating system 

• 700MHz - 866MHz (No FPU)

• 15 - 22W @ 700MHz all 64 cores running 

• Idle tiles can be put into low-power sleep mode 

• ANSI standard C / C++ compiler 

• Supports SMP Linux with 2.6 kernel 


Issues 
• Special data homing considerations required when programming/compiling for the TILE64


- Uses a crude coherence strategy; each shared memory location may only be cached in one tile - its "home" tile. A 
location's home tile is fixed at runtime 

- Accessing remotely-cached data is correct, but performance is low 

- Prevents TILE64 from efficiently running existing generic multithreaded code 

- Careful "homing" of data is crucial to good scalability 


• TILE64's compiler does not use the now-standard C++ ABI popularized by GCC 3.2+ 

- This compiler is closed-source, based on SGI's "MIPSPro" 

- Prevents linkage with and preprocessing by other C++ compilers, such as AESOP 
Maestro as Proxy for Maestro-lite

• Origin - DARPA Polymorphic Computer Architecture (PCA)
Program

• DARPA/DTRA Radiation Hardened By Design (RHBD) 90 nm IBM 
CMOS process 

• Government purchased Tilera Corp's (commercial 64 core 
processor) software intellectual property (IP) for government 
space-based applications 

• Program managed by National Reconnaissance Office (NRO) 

• Maestro Chip developed by Boeing Solid-State Electronics 
Development (SSED) 

• Government customers: NASA, NRO, Air Force Research 
Laboratory 

• Maestro basic specifications 

- 7 X 7 tile array (49 cores) 

- 300 MHz, 45 GOPs, 22 GFLOPS (FPU on each tile) 

- 18 Watts typical 

- RHBD Total Ionizing Dose (TID) >500krad 
Tilera TilePro64 as Proxy for Maestro

Specifications 

• Launched on September 22, 2008 

• 8 X 8 tile array (64 cores) 

• Each tile on chip is an independent processor 
capable of running an entire operating system 

• 700MHz - 866MHz (No FPU) 

• 19 - 23W @ 700MHz all 64 cores running 

• Idle tiles can be put into low-power sleep mode 

• ANSI standard C / C++ compiler 

• Supports SMP Linux with 2.6 kernel 

Addresses issues exhibited by TILE64

• Uses a better cache coherence protocol allowing many tiles to cache the same 
data 

• The native compiler is now an open-source port of GCC 4.4, using standard C++ 
ABI 

- Compiler and toolchain are actively supported

- October 14, 2011 - Tilera contributed its port back to the GCC project

• First-class Linux kernel architecture 

• Presently in the IPM used during recent flights

Goddard Space Flight Center 
Tilera Tile-Gx8036 / Tile-Gx8009 as Proxy for Maestro

Specifications

• Launched on January 30, 2012

• 6 X 6 tile array (36 cores) / 3 X 3 tile array (9 cores)

• Each tile on chip is an independent processor
capable of running an entire operating system

• 1GHz - 1.5GHz (FPU)

• 27 - 30W @ 1.2GHz all 36 cores running

• 9 - 10W @ 1.0GHz all 9 cores running

• Idle tiles can be put into low-power sleep mode

• ANSI standard C / C++ compiler

• Supports SMP Linux with 2.6 kernel



Figure (Tile-Gx board interfaces), labels: USB 2.0; four 1 Gb/10 Gb Ethernet ports; PCIe 8-lane Gen1/Gen2
SpaceCube 1.5

SpaceCube family (table columns: Unit, Mission, Notes, Specs, Stats, Status)

• SpaceCube 1.0a
- Mission: Hubble SM 4
- Notes: RNS Experiment, STS-125 May 2009
- Specs: 4"x4" card, (2) Virtex4
- Stats: Size 5"x5"x7", Wt 7.5 lbs, Pwr 16W x 2
- Status: 2009 Flight

• SpaceCube 1.0b
- Mission: MISSE-7/8
- Notes: added RS-485, RHBS, STS-129 Nov 2009
- Specs: 4"x4" card, (2) Virtex4
- Stats: Size 5"x5"x7", Wt 7.5 lbs, Pwr 16W x 2
- Status: Operating on ISS since Nov 2009

• SpaceCube 1.5
- Mission: SMART
- Notes: added GigE & SATA, SubTec-5 Jun 2011
- Specs: 4"x4" card, (1) Virtex5
- Stats: Size 5"x5"x4", Wt 4 lbs, Pwr 10W
- Status: 2011 Flight

• SpaceCube 1.0c
- Mission: Argon Demo
- Notes: added 1553 & Ethernet
- Specs: 4"x4" card, (2) Virtex4
- Stats: Size 5"x5"x7", Wt 7.5 lbs, Pwr 18W x 2
- Status: Demonstration Testbed

• SpaceCube 1.0 d, e, f
- Mission: STP-H4, future STP-H5 & RRM3
- Notes: added 1553 & Ethernet
- Specs: 4"x4" card, (2) Virtex4
- Stats: Size 5"x5"x7", Wt 7.5 lbs, Pwr 15W
- Status: On ISS since Aug 2013

• SpaceCube 2.0
- Mission: Earth/Space Science, SSCO, GPS Nav
- Notes: Std 3U form factor, GigE, SATA, Spacewire, cPCI
- Specs: 4"x7" card, (2) Virtex5 + (1) Aeroflex
- Stats: Size 5"x5"x7", Wt < 10 lbs, Pwr 15-20W
- Status: EM on ISS since Aug 2013 (Flight Unit in Development)

• SpaceCube 2.0 Mini
- Mission: CubeSats, Sounding Rocket, UAV
- Notes: "Mini" version of SpaceCube 2.0
- Specs: 3.5"x3.5" card, (1) Virtex5 + (1) Aeroflex
- Stats: Size 4"x4"x4", Wt < 3 lbs, Pwr 8W
- Status: Flight Unit in Development (2016 launch)
Figure (SpaceCube flight heritage), labels: 2009 STS-125; 2009 MISSE-7; 2012 SMART; 2013 STP-H4; 2016 STP-H5; 2015 GPS Demo; Robotic Servicing; Numerous proposals for Earth/Space/Helio

SCIENCE DATA PROCESSING BRANCH • Code 587 • NASA GSFC





Processor Comparison

Processor            MIPS    Power    MIPS/W
MIL-STD-1750A        3       15W      0.2
RAD6000              35      15W      2.33
RAD750               300     15W      20
LEON 3FT             75      5W       15
LEON3FT Dual-Core    250     10W      25
BRE440 (PPC)         230     5W       46
Maxwell SCS750       1200    25W      48
SpaceCube 1.0        3000    7.5W     400
SpaceCube 2.0        6000    10W      600
SpaceCube Mini       3000    5W       600
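The MIPS/W column is simply MIPS divided by power. A minimal sketch that recomputes it from the MIPS and power values listed for each processor, then ranks the processors, is given below. The variable and function names are illustrative assumptions.

```python
# (processor, MIPS, power in watts) as listed in the comparison.
PROCESSORS = [
    ("MIL-STD-1750A", 3, 15.0),
    ("RAD6000", 35, 15.0),
    ("RAD750", 300, 15.0),
    ("LEON 3FT", 75, 5.0),
    ("LEON3FT Dual-Core", 250, 10.0),
    ("BRE440 (PPC)", 230, 5.0),
    ("Maxwell SCS750", 1200, 25.0),
    ("SpaceCube 1.0", 3000, 7.5),
    ("SpaceCube 2.0", 6000, 10.0),
    ("SpaceCube Mini", 3000, 5.0),
]

def mips_per_watt(mips, watts):
    """Processing efficiency: millions of instructions per second per watt."""
    return mips / watts

# Rank from most to least efficient (ties keep their listed order).
ranked = sorted(PROCESSORS, key=lambda p: mips_per_watt(p[1], p[2]), reverse=True)
for name, mips, watts in ranked:
    print(f"{name:20s} {mips_per_watt(mips, watts):7.2f} MIPS/W")
```

The recomputed ratios agree with the listed MIPS/W values after rounding.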
ZC702 - Zynq (ARM/FPGA Processor) Proxy for
COTS+RH+FTC CHREC Space Processor (CSP)

COTS

• Zynq-7020 hybrid SoC

- Dual ARM A9/NEON cores

- Artix-7 FPGA fabric + hard IP

• DDR3 memory

RadHard

• NAND flash

• Power circuit

• Reset circuit

• Watchdog unit

FTC = Fault-Tolerant Computing

• Variety of mechanisms

- External watchdog unit to monitor Zynq health and reset as needed

- RSA-authenticated bootstrap (primary, secondary) on NAND flash

- ECC memory controller for DDR3 within Zynq

- ADDAM middleware with message, health, and job services

- FPGA configuration scrubber with multiple modes

- Internal watchdogs within Zynq to monitor behavior

- Optional hardware, information, network, software, and time redundancy
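The external watchdog mechanism listed above follows a common pattern: the monitored processor sends a periodic heartbeat, and the watchdog forces a reset if the heartbeat stops. The sketch below is a software model of that pattern only. The class, its method names, and the 2-second timeout are hypothetical and do not describe the CSP hardware.

```python
class Watchdog:
    """Software model of an external watchdog timer (illustrative only)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s
        self.last_kick = 0.0
        self.resets = 0

    def kick(self, now):
        """Called by the monitored processor to signal it is still healthy."""
        self.last_kick = now

    def poll(self, now):
        """Called by the watchdog side; returns True if a reset was issued."""
        if now - self.last_kick > self.timeout_s:
            self.resets += 1
            self.last_kick = now  # the reset restarts the timeout window
            return True
        return False

wd = Watchdog(timeout_s=2.0)
wd.kick(now=1.0)
print(wd.poll(now=2.5))   # heartbeat is 1.5 s old, within the 2 s window
print(wd.poll(now=4.0))   # heartbeat is 3.0 s old, so a reset is issued
```

Time is passed in explicitly so the model stays deterministic. A real watchdog would use a hardware counter instead.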




NSF Center for High-Performance 
Reconfigurable Computing 
IPM Hardware


• 14 X 14 X 6 inches 

• Wide-Input-Range DC voltage (6V-30V) 

• Made of strong durable aluminum alloy 

• Dual mounting brackets 

• Flush design 

• Removable side panels 

• Mounting racks are electrically isolated from the box 

• Appropriate space allocation for interchangeable Tilera
and SpaceCube boards

• Electronic components 

- Tilera development board 

- SpaceCube development board 

- Single board computer 

- 600GB SSD 

- Gigabit Ethernet switch 

- Transceiver radio 

- Power board 



IPM Hardware

Compact Hyperspectral Advanced Imager (CHAI V640)

SPECIFICATIONS

MECHANICALS (ESTIMATE)

- Size (with lens): 125 x 101 x 75 mm
- Size (with telescope): 200 x 101 x 75 mm
- Weight: .45 kg [.99 lbs]
- Power: 20 watts
- Temperature Range: -20 to +50 C
- Size does include INS/GPS

OPTICS (SPECIFICATION)

- Spectrometer Type: Dyson
- Telescope: All-reflective telescope
- Field of View: 40 degrees
- Cross Track Pixels: 640
- F-Number: f/2
- Spectral Range: 350-1000 nm (Reflective), 400-1000 nm (Refractive)
- Smile Distortion: < 0.1 pixels
- Keystone Distortion: < 0.1 pixels
- Stray Light: < 1e-4 Point Source Transmission
- Spectral Bands: 256
- Spectral Sampling: 2.5, 5, 10 nm
- Peak Grating Efficiency: ee%
- Slit Size: 9.5 x .015 mm

IMAGE SENSOR

- Image Sensor: 640 x 512, with 15 µm pixels
- Full Well Capacity: Gain 0: 500,000; Gain 1: 60,000; Gain 2: 10,000
- Read Noise: Gain 0: < 63 electrons; Gain 1: < 42 electrons; Gain 2: < 10 electrons
- Maximum Frame Rate: 1000 frames/second
- Quantum Efficiency: > 50% @ 380 nm; 60% @ 400-900 nm; > 30% @ 1000 nm
- Camera Interface: USB 3

DATA ACQUISITION

- 500 MB Solid State Recorder
- Serial Interface for GPS/INS

CHAI SOFTWARE

- Trigger Modes: Pilot, GUI, electronic, and Lat/Long triggered acquisition
- Visualization: 3-band RGB waterfall display of real-time and recorded data
- Metadata: Temperature, pressure, and humidity
- Data Format: RAW, ENVI BIL, or Processed
- Processing: EXPRESSO

from Brandywine Electronics
Contact: John Fisher

12/5/2014











ChaiV640 Box and IPM (Tilera Multicore Proxy for Maestro)

Photo labels: FreeWave Transceiver; Tilera TilePro; SDN500 IMU; ChaiV640 Box; Brandywine ChaiV640

Photo captions: ChaiV640/IPM on Bussmann Helo Takeoff; ChaiV640 Box Mounted on Helo
Software with Addition of Publisher Node

Figure (software block diagram), labels: Onboard IPM; Publisher; FLAASH Atmospheric Correction; Geocorrection for Airborne Platforms (GCAP); Ground; Level 0 & Level 1 Processing; Level 2 - e.g. Spectral Angle Mapper

WCPS - Web Coverage Processing Service

CmdIn - Command Ingest

TlmOut - Telemetry Output

CASPER - Continuous Activity Scheduling Planning Execution and Replanning system

CFDP - CCSDS File Delivery Protocol

ASIST - Advanced Spacecraft Integration and System Testing Software
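The Level 2 example named on this slide, the Spectral Angle Mapper, classifies a pixel by the angle between its spectrum and each reference spectrum; the smallest angle wins. The sketch below shows the general technique in plain Python. The function names, the 0.1 radian threshold, and the 4-band spectra are made-up assumptions, not the WCPS implementation.

```python
import math

def spectral_angle(pixel, reference):
    """Angle in radians between a pixel spectrum and a reference spectrum."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0 or norm_r == 0:
        raise ValueError("spectra must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)

def classify(pixel, library, max_angle=0.1):
    """Return the best-matching class name, or None if no angle is below max_angle."""
    best_name, best_angle = None, max_angle
    for name, reference in library.items():
        angle = spectral_angle(pixel, reference)
        if angle < best_angle:
            best_name, best_angle = name, angle
    return best_name

# Hypothetical 4-band reference spectra (made-up numbers).
library = {"water": [0.05, 0.04, 0.03, 0.01], "bloom": [0.04, 0.09, 0.05, 0.30]}
print(classify([0.08, 0.18, 0.10, 0.61], library))
```

Because the angle ignores vector length, a pixel that is a brighter or darker copy of a reference spectrum still matches it, which makes the method tolerant of illumination differences.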
Data Processing Chain for Benchmarking

Main Data Source

- CHAI v640 on Bussmann Helicopter

Alternative Data Sources

- AMS on Citation (Forest Service)

- GLiHT on UC-12 (Langley)

- CHAI v640 on UC-12 (Langley)

- EO-1 ALI and Hyperion Data

Processing chain (from the figure)

- Ingest / Level 0

- Level 1R: Radiometric Correction

- FLAASH AC: Atmospheric Correction

- Level 1G: Geometric Correction

- WCPS: Classifiers & Other Algorithms

- Level 2: SAM

- Level 2: Vectorizer

Downlink high level data products to
ground at 200 kbps
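A quick calculation shows why only high level products are downlinked at 200 kbps. The sketch below compares a raw Chai640 cube (size taken from the benchmark slide) with a small derived product. The 250,000-byte product size is a hypothetical figure, and kbps is assumed to mean 1000 bits per second.

```python
def downlink_seconds(num_bytes, link_bps):
    """Time to send num_bytes over a link of link_bps bits per second."""
    return num_bytes * 8 / link_bps

LINK_BPS = 200_000              # 200 kbps downlink from the slide
RAW_CUBE_BYTES = 828_447_408    # Chai640 cube size from the benchmark slide
PRODUCT_BYTES = 250_000         # hypothetical size of a vectorized map product

raw_hours = downlink_seconds(RAW_CUBE_BYTES, LINK_BPS) / 3600
product_s = downlink_seconds(PRODUCT_BYTES, LINK_BPS)
print(f"raw cube: {raw_hours:.1f} h, product: {product_s:.0f} s")
# prints "raw cube: 9.2 h, product: 10 s"
```

Under these assumptions a raw cube would occupy the link for hours, while a processed product fits in seconds.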


Publisher/Consumer/GeoSocial API Architecture

Figure (architecture diagram), labels: Satellites; Intelligent Payload Module in S/C; Publisher; Big Data; API; Ground; Regional/Consumer Node; Publisher; Web Application; Mobile Application; Social Networks; Societal Products

Concept developed by Pat Cappelaere (Vightel/GSFC)
and Stu Frye (SGT/GSFC)

A methodology to rapidly discover, obtain and distribute satellite
data products via social networks and open source software
Initial Hyperspectral Image Processing Benchmark

Processor                        Radiometric   *Atmospheric   Geometric    *WCPS
                                 Correction    Correction     Correction   (vis_composite)
                                               (FLAASH)       (GCAP)
864 MHz TILEPro64 (1 core)       121.95        2477.74        183.42       72.39
864 MHz TILEPro64 (49 cores)     23.83         TBD            4.59         21.63
1.2 GHz TILE-Gx36 (1 core)       57.22         897.71         28.51        19.93
1.2 GHz TILE-Gx36 (36 cores)     9.21          TBD            1.41         8.72
2.2 GHz Intel Core i7            2.09          58.29          0.169        2.26
Virtex 5 FPGA                    TBD           TBD            TBD          TBD

Image data:

- GLiHT: 1004 x 1028 x 402 (829,818,048 bytes)

- Hyperion: 256 x 6702 x 242 (830,404,608 bytes)

- Chai640: 696 x 2103 x 283 (828,447,408 bytes)

Notes:

- Unit is in seconds

- TILEPro64 - No floating point support

- TILE-Gx36 - Partial floating point support

- * Indicates time includes file I/O
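The single-core and multicore timings above can be turned into speedup (single-core time divided by multicore time) and parallel efficiency (speedup divided by core count). The sketch below does this for the steps that have both timings. The dictionary layout and function names are illustrative assumptions.

```python
# Single-core and multicore times in seconds, copied from the benchmark.
# Atmospheric correction is omitted because its multicore times are TBD.
TIMES = {
    "TILEPro64": {"cores": 49, "radiometric": (121.95, 23.83),
                  "geometric": (183.42, 4.59), "wcps": (72.39, 21.63)},
    "TILE-Gx36": {"cores": 36, "radiometric": (57.22, 9.21),
                  "geometric": (28.51, 1.41), "wcps": (19.93, 8.72)},
}

def speedup(t_single, t_multi):
    """How many times faster the multicore run is than the single-core run."""
    return t_single / t_multi

def efficiency(t_single, t_multi, cores):
    """Fraction of ideal linear scaling achieved (1.0 would be perfect)."""
    return speedup(t_single, t_multi) / cores

for chip, row in TIMES.items():
    for step in ("radiometric", "geometric", "wcps"):
        t1, tn = row[step]
        print(f"{chip:10s} {step:12s} speedup {speedup(t1, tn):6.2f}  "
              f"efficiency {efficiency(t1, tn, row['cores']):.2f}")
```

Geometric correction scales far better than the other steps on both chips. The WCPS timings include file I/O, which limits the gain from extra cores.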
Detail on Benchmarking of
Atmospheric Correction

• Worked with Spectral Sciences to modify FLAASH GLUT version to support airborne atmospheric correction.

• Optimized FLAASH to run on the multicore Tilera processor.

• Processed CHAI v640 data with FLAASH to create reflectance values.

• The cube smoothing subroutine, a Fast Fourier Transform that takes about 20% of processing time, is being used to benchmark FPGA acceleration.

FLAASH Parallelization Effort (times in seconds)

Subroutine              wall           system        user           notes
Cube smoothing          145.741        18.1373       127.6037       Smooth; FFT
Cube reduction          112.363        13.9834       98.3796
Cube load & distrib     89.0128        11.0775       77.9353
Cube gather & write     83.2988        10.3664       72.9324
Aerosol Retrieval       52.6236        6.54892       46.07468
Water col retrieval     34.7984        4.3306        30.4678
Spectral Polishing      33.5795        4.17891       29.40059
Sensor calibration      28.5097        3.54799       24.96171
Images and Masks        17.1781        2.13779       15.04031
Spectral Resampling     8.72743        1.08611       7.64132        Smile_Resampler; parallelized: YES
Cloud Masking           5.93287        0.738336      5.194534
Sensor slit function    0.764612       0.0951547     0.6694573
Modtran Tables          0.416416       0.0518223     0.3645937
un-categorized          0.0397966      0.00495262    0.03484398
Flaash setup            0.000163794    2.04E-05      1.43E-04
total time              645.7813884    81.425006     564.3563824
total time (h:m:s)      0:10:46        0:01:21       0:09:24

Other notes in the table: Condense; Reflect (parallelized: YES, parallel speedup about 5x). Spectral Resampling: original wall time about 241 s, parallel speedup about 27x.

Original wall time: 1060 (0:17:40)

Overall speedup: 1.641422344


36 
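Since the FFT-based subroutine accounts for about 20% of FLAASH processing time, Amdahl's law bounds what FPGA acceleration of that routine alone can deliver. The sketch below evaluates the bound for a few hypothetical acceleration factors, then reproduces the overall speedup reported on the slide from the original and parallel wall times.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the runtime is accelerated by `factor`."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)

FFT_FRACTION = 0.20   # "about 20% of processing time" from the slide

for factor in (2, 10, 100):          # hypothetical FPGA acceleration factors
    print(factor, round(amdahl_speedup(FFT_FRACTION, factor), 3))

# Upper bound if the FFT took no time at all.
print(round(1.0 / (1.0 - FFT_FRACTION), 3))

# Overall speedup reported on the slide: original vs. parallel wall time.
print(round(1060 / 645.7813884, 4))
```

Even an arbitrarily fast FFT caps the whole-chain gain at 1.25x, so the FFT benchmark is best read as a probe of FPGA capability rather than as the main source of end-to-end speedup.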
FFT Benchmark Tests with Various CPU Processors and FPGA

Processor   Cores   FFTW 1 band 128 X 256   Clock rate   Power consumption
                    time (msec)             (MHz)        (watts)
TileGX      1       21.3                    -            -
TileGX      4       10.0                    -            -
Maestro     1       187                     200          14 watts
Maestro     8       55                      200          14 watts
ZynqARM     1       8.7                     667          3 watts
ZynqARM     2       6.9                     667          3 watts
XeonPhi     1       9.0                     -            -
XeonPhi     171     0.221                   -            225 watts
FPGA        NA      1.5                     100          <3 watts
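For a low power payload, energy per transform is often more telling than raw time. The sketch below multiplies each listed time by the listed power for the rows where power is stated. It assumes the stated power is drawn for the whole transform and treats the FPGA's "<3 watts" as an upper bound of 3 W; both are simplifying assumptions.

```python
# (label, cores, time in ms, power in W); only rows with a stated power.
# The FPGA power is given as "<3 watts", so 3 W is used as an upper bound.
ROWS = [
    ("Maestro", 1, 187.0, 14.0),
    ("Maestro", 8, 55.0, 14.0),
    ("ZynqARM", 1, 8.7, 3.0),
    ("ZynqARM", 2, 6.9, 3.0),
    ("XeonPhi", 171, 0.221, 225.0),
    ("FPGA", None, 1.5, 3.0),
]

def energy_mj(time_ms, power_w):
    """Energy in millijoules: milliseconds times watts."""
    return time_ms * power_w

# List the rows from least to most energy per transform.
for label, cores, time_ms, power_w in sorted(ROWS, key=lambda r: energy_mj(r[2], r[3])):
    print(f"{label:8s} cores={cores!s:4s} {energy_mj(time_ms, power_w):8.1f} mJ")
```

Under these assumptions the FPGA uses the least energy per FFT, followed by the Zynq ARM cores, even though the Xeon Phi is fastest in wall time.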
Goal

• Experimenting with putting almost all of the data processing
chain in FPGA using the Zynq-based ZC702 (proxy for CSP) to
do the benchmark

• Install ZC702 in IPM and fly on helicopter as part of our flight
tests

• Issues

- Moving data between programmable logic, processor system and
memory

- Design of data processing chain buffering scheme

• Based on DMA access, throughput of as much as 10
Gbps might be possible

• Would like to demonstrate producing high level data products
while keeping up with an input instrument data rate of
500 - 1000 Mbps
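"Keeping up" with the instrument means a data cube must be processed no slower than it is acquired. The sketch below estimates how long a Chai640-sized cube (size from the benchmark slide) takes to arrive at 500 and 1000 Mbps, then checks a hypothetical 10-second processing time against that budget. The function names and the 10-second figure are assumptions, and Mbps is taken as 10^6 bits per second.

```python
def required_bytes_per_second(rate_mbps):
    """Convert an instrument rate in megabits per second to bytes per second."""
    return rate_mbps * 1_000_000 / 8

def keeps_up(cube_bytes, processing_s, rate_mbps):
    """True if a cube is processed no slower than the instrument produces it."""
    acquisition_s = cube_bytes / required_bytes_per_second(rate_mbps)
    return processing_s <= acquisition_s

CUBE_BYTES = 828_447_408   # Chai640 cube size from the benchmark slide
for rate in (500, 1000):
    acquisition_s = CUBE_BYTES / required_bytes_per_second(rate)
    print(f"{rate} Mbps: one cube every {acquisition_s:.1f} s")

# Hypothetical total chain time of 10 s per cube (not a measured value).
print(keeps_up(CUBE_BYTES, processing_s=10.0, rate_mbps=500))
print(keeps_up(CUBE_BYTES, processing_s=10.0, rate_mbps=1000))
```

With these numbers the chain has roughly 13 s per cube at 500 Mbps and under 7 s at 1000 Mbps.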
CHREC Space Processor (CSP) Missions

• CSP Tech Demo ISIM (Space Station)

- 2 CSPs

- Targeted to be on Space Station Summer 2015

- Gary Crum/587

• Compact Radiation Belt Explorer (CeREs) is part
of NASA's Low-Cost Access to Space program

- 3U Cubesat

- 1 CSP

- Launch May 2015
Conclusion

• Working towards IPM and GeoSocial API 
integrated architecture 

• Working towards radiation tolerant IPM 

• Prototype how much of the flight software and 
data processing software can be hosted 

• Measure relative throughput performance of 
representative data processing chain 

• Present AIST-11 effort is working mostly in 
multicore processor environment and only begins 
to explore FPGA performance 